One zarr to rule them all #12
Conversation
- improved the dask SLURMCluster setup, since data retrieval is massive
- set time chunking to roughly 250 MB per chunk and append to the zarr archive in pieces, to prevent timeouts of the dask client
- plotting with rank zero only
Okay, this PR is ready for review and merge. I have copied @cosunae's latest single zarr to balfrin.cscs.ch and updated the code accordingly. All zarr-creation scripts have now been removed from the repo. The new example file was trained on only a few timesteps, just for debugging. I suggest doing the following before merging:
This code is functional for inference and prediction plotting.
Pinning PyTorch to a specific version was required because otherwise CPU-only variants would be installed on some systems.
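As an illustration of such a pin, a `requirements.txt` fragment might look like the following. The version number and CUDA tag are placeholders, not the versions this repo actually pins; the extra index is PyTorch's official wheel index.

```text
# requirements.txt fragment (version and CUDA tag illustrative)
--extra-index-url https://download.pytorch.org/whl/cu121
torch==2.1.0+cu121
```

Pinning to an explicit CUDA build prevents pip from silently resolving to the CPU-only wheel on hosts where the default index wins.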
This PR simplifies the dataloader and input-data creation.

New features
- `create_zarr_archive.py` now generates one huge zarr archive directly from the COSMO-2 GRIB files. @cosunae this might be of interest for offline data preparation.
- `weather_dataset.py` now loads one zarr archive, and `__get_item__` now also returns the datetimes of the current batch.
- `ar_model.py`: the datetimes of the batch are compared to `constants.EVAL_DATETIMES`, massively simplifying model testing and prediction. @clechartre this is certainly relevant for prediction/verification.

Reasoning
- `constants.STORE_EXAMPLE_DATA`

Notes
- `example.ckpt` was uploaded for evaluation/testing (trained on a very small dataset)
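The datetime comparison described above can be sketched like this. It is a hedged illustration, not the repo's implementation: `EVAL_DATETIMES`, `select_eval_samples`, and the batch layout are assumed names standing in for `constants.EVAL_DATETIMES` and the model's actual hooks.

```python
# Sketch: the dataset returns the datetimes of each batch alongside the data,
# and the model keeps only the samples whose datetime is flagged for
# evaluation/plotting. EVAL_DATETIMES stands in for constants.EVAL_DATETIMES.
from datetime import datetime

EVAL_DATETIMES = {
    datetime(2020, 1, 1, 12),
    datetime(2020, 1, 2, 12),
}

def select_eval_samples(batch, batch_times):
    """Return (sample, time) pairs whose datetime is marked for evaluation."""
    return [(x, t) for x, t in zip(batch, batch_times) if t in EVAL_DATETIMES]

batch = ["sample_a", "sample_b", "sample_c"]
times = [
    datetime(2020, 1, 1, 12),
    datetime(2020, 1, 1, 18),
    datetime(2020, 1, 2, 12),
]
print(select_eval_samples(batch, times))  # keeps sample_a and sample_c
```

Because the selection is a plain set-membership test on the returned datetimes, the model no longer needs to track dataset indices to know which samples to verify or plot.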